Fix automatic SystemVM template download to S3 secondary storage #12426
Damans227 wants to merge 7 commits into apache:4.20
Conversation
|
@blueorangutan package |
|
@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16368 |
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## 4.20 #12426 +/- ##
============================================
+ Coverage 16.23% 16.26% +0.02%
- Complexity 13382 13432 +50
============================================
Files 5657 5660 +3
Lines 498999 499991 +992
Branches 60566 60713 +147
============================================
+ Hits 81035 81333 +298
- Misses 408928 409585 +657
- Partials 9036 9073 +37
|
|
@blueorangutan test |
|
@nvazquez a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
[SF] Trillian Build Failed (tid-15187) |
|
|
||
| client.setEndpoint(clientOptions.getEndPoint()); | ||
| // Enable path-style access for S3-compatible storage | ||
| client.setS3ClientOptions(com.amazonaws.services.s3.S3ClientOptions.builder().setPathStyleAccess(true).build()); |
While debugging the issue, I noticed that the connection to MinIO failed during template upload with an error like:
UnknownHostException: cloudstack-secondary.10.0.34.157
That is, the SDK was trying to connect to http://cloudstack-secondary.10.0.34.157:9000/..., which is the virtual-hosted style (see: virtual-hosted style vs. path style syntax for S3).
Looking at other S3-compatible plugins in CloudStack, I found that both CephObjectStoreDriverImpl and CloudianHyperStoreUtil use enablePathStyleAccess() to get path-style URLs such as http://10.0.34.157:9000/cloudstack-secondary/..., i.e.:
AmazonS3 client = AmazonS3ClientBuilder.standard()
.enablePathStyleAccess()
.withCredentials(new AWSStaticCredentialsProvider(new BasicAWSCredentials(accessKey, secretKey)))
.withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(url, "auto"))
.build();
Applying the same fix here worked. The AWS SDK documentation confirms that path-style access must be explicitly enabled for S3-compatible stores.
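To make the difference between the two addressing styles concrete, here is a minimal sketch that builds both URL forms by hand for the MinIO endpoint mentioned above. The class and method names (`S3AddressingStyles`, `virtualHostedUrl`, `pathStyleUrl`) are hypothetical and exist only for illustration; the sketch does not use the AWS SDK and is not how the SDK is implemented internally.

```java
import java.net.URI;

/** Illustrative only: shows how the two S3 addressing styles shape a request URL. */
final class S3AddressingStyles {

    /** Virtual-hosted style: the bucket is prepended to the host name. */
    static String virtualHostedUrl(URI endpoint, String bucket, String key) {
        return endpoint.getScheme() + "://" + bucket + "." + endpoint.getHost()
                + portSuffix(endpoint) + "/" + key;
    }

    /** Path style: the host is left untouched and the bucket is the first path segment. */
    static String pathStyleUrl(URI endpoint, String bucket, String key) {
        return endpoint.getScheme() + "://" + endpoint.getHost()
                + portSuffix(endpoint) + "/" + bucket + "/" + key;
    }

    /** Returns ":port" when the endpoint names a port, otherwise an empty string. */
    private static String portSuffix(URI endpoint) {
        return endpoint.getPort() == -1 ? "" : ":" + endpoint.getPort();
    }

    public static void main(String[] args) {
        URI minio = URI.create("http://10.0.34.157:9000");
        // With an IP-based endpoint the virtual-hosted form yields a host name
        // that cannot be resolved, which matches the UnknownHostException above.
        System.out.println(virtualHostedUrl(minio, "cloudstack-secondary", "template/obj"));
        System.out.println(pathStyleUrl(minio, "cloudstack-secondary", "template/obj"));
    }
}
```

With an IP-based endpoint, only the path-style form keeps a resolvable host, which is consistent with the failure described in this comment.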
Pull request overview
This PR fixes an issue where SystemVM templates fail to automatically download to S3 secondary storage when adding it to a CloudStack zone. The root cause was that S3 stores use REGION scope, but the endpoint selector only returned LocalHostEndpoint for ZONE-scoped stores with null scope IDs.
Changes:
- Modified endpoint selection logic to support REGION-scoped stores for SYSTEM template downloads
- Added null safety checks for data stores without URLs (e.g., S3 object stores)
- Enabled path-style access for S3-compatible storage systems like MinIO
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| engine/storage/src/main/java/org/apache/cloudstack/storage/endpoint/DefaultEndPointSelector.java | Extended condition to allow LocalHostEndpoint for REGION-scoped stores with SYSTEM templates |
| services/secondary-storage/controller/src/main/java/org/apache/cloudstack/secondarystorage/SecondaryStorageManagerImpl.java | Added null checks to skip data stores without URLs when building secondary storage addresses |
| services/secondary-storage/controller/src/test/java/org/apache/cloudstack/secondarystorage/SecondaryStorageManagerImplTest.java | Added comprehensive test coverage for null handling in data store processing |
| utils/src/main/java/com/cloud/utils/storage/S3/S3Utils.java | Enabled path-style access for S3-compatible storage systems |
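As a rough illustration of the null-check change listed for SecondaryStorageManagerImpl, the sketch below skips stores that have no URL while collecting addresses. The `Store` type, the `collectUrls` method, and the sample NFS URL are hypothetical stand-ins, not the actual CloudStack classes or the code in this PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: skip data stores without a URL instead of dereferencing null. */
final class NullUrlSkipSketch {

    /** Hypothetical stand-in for a data store; S3 object stores may have no URL. */
    static final class Store {
        final String name;
        final String url;

        Store(String name, String url) {
            this.name = name;
            this.url = url;
        }
    }

    /** Collects the URLs of all stores that actually have one. */
    static List<String> collectUrls(List<Store> stores) {
        List<String> urls = new ArrayList<>();
        for (Store store : stores) {
            if (store == null || store.url == null) {
                continue; // nothing to resolve for this store, so skip it
            }
            urls.add(store.url);
        }
        return urls;
    }
}
```

Skipping rather than failing keeps the address-building step working when NFS and S3 stores are mixed in the same list.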
| // Enable path-style access for S3-compatible storage | ||
| client.setS3ClientOptions(com.amazonaws.services.s3.S3ClientOptions.builder().setPathStyleAccess(true).build()); |
Path-style access is being enabled unconditionally for all S3 endpoints, including AWS S3 which deprecated path-style access in favor of virtual-hosted-style. This could cause compatibility issues with AWS S3. Consider making path-style access configurable through ClientOptions, or only enabling it when a custom endpoint is detected (non-AWS S3).
Hmm, the path-style access code is already inside the if (StringUtils.isNotBlank(clientOptions.getEndPoint())) block, which means path-style access is only enabled when a custom endpoint is specified. The code already does what Copilot is asking for.
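The guard described in this reply can be sketched in isolation as follows. `PathStyleGuard` and `usePathStyle` are hypothetical names, and the blank check is a hand-written approximation of `StringUtils.isNotBlank`, so treat this as an illustration of the condition rather than the code in S3Utils.

```java
/** Illustrative only: path-style access is tied to the presence of a custom endpoint. */
final class PathStyleGuard {

    /**
     * Hypothetical helper: returns true only when a non-blank custom endpoint
     * is configured, so default AWS endpoints keep their default addressing.
     */
    static boolean usePathStyle(String endPoint) {
        return endPoint != null && !endPoint.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(usePathStyle("http://10.0.34.157:9000")); // custom endpoint
        System.out.println(usePathStyle(null));                      // no endpoint configured
    }
}
```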
| if (tmplInfo.getTemplateType() == TemplateType.SYSTEM && | ||
| (store.getScope().getScopeType() == ScopeType.REGION || | ||
| (store.getScope().getScopeType() == ScopeType.ZONE && store.getScope().getScopeId() == null))) { |
The modified endpoint selection logic for REGION-scoped SYSTEM templates lacks test coverage. Consider adding unit tests in the engine/storage module to verify that LocalHostEndpoint is correctly returned for REGION-scoped stores with SYSTEM templates, similar to the existing test coverage in SecondaryStorageManagerImplTest.
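One possible shape for such a test is to isolate the condition as a pure predicate and check its truth table. The sketch below does that with simplified stand-in enums; `EndpointSelectionSketch`, `useLocalHostEndpoint`, and the trimmed enum values are assumptions for illustration and do not reproduce the real DefaultEndPointSelector or its types.

```java
/** Illustrative only: the SYSTEM-template endpoint condition as a standalone predicate. */
final class EndpointSelectionSketch {

    /** Simplified stand-ins for the CloudStack enums (values trimmed for brevity). */
    enum ScopeType { REGION, ZONE, CLUSTER, HOST }
    enum TemplateType { SYSTEM, USER }

    /**
     * Mirrors the condition in the diff: SYSTEM templates use the local host
     * endpoint for REGION-scoped stores, or for ZONE-scoped stores with no scope ID.
     */
    static boolean useLocalHostEndpoint(TemplateType type, ScopeType scope, Long scopeId) {
        if (type != TemplateType.SYSTEM) {
            return false;
        }
        return scope == ScopeType.REGION
                || (scope == ScopeType.ZONE && scopeId == null);
    }
}
```

A real unit test would still need to exercise the selector itself with mocked stores; the predicate form only documents which combinations are expected to match.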
|
@blueorangutan package |
|
@kiranchavala a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16615 |
Please find below the issues that I observed.
Tested with the PR packages on Oracle Linux 8.6.
- Unable to use Ceph S3 storage; got the following exception in the logs:
2026-01-30 07:52:30,770 DEBUG [o.a.c.s.r.NfsSecondaryStorageResource] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) Executing command "DownloadCommand" [{"hvm":false,"description":"SystemVM Template (KVM)","checksum":"c059b0d051e0cd6fbe9d5d4fc40c7e5d","maxDownloadSizeInBytes":53687091200,"id":3,"resourceType":"TEMPLATE","installPath":"template/tmpl/1/3/routing-3","_store":{"id":1,"uuid":"fdb66906-6b57-4e32-a7df-cbb93a917fce","accessKey":"AFYT2BKNI1U8T6DY6435","secretKey":"kia5kyDAZuP7QjwxNmhVAEE5l5dzsSJWbxtSXCIA","endPoint":"https://10.0.33.100","bucketName":"testbucket","httpsFlag":true,"created":"Jan 30, 2026, 7:52:20 AM","enableRRS":false,"maxSingleUploadSizeInBytes":5368709120},"followRedirects":false,"url":"http://download.cloudstack.org/systemvm/4.6/systemvm64template-4.6.0-kvm.qcow2.bz2","format":"QCOW2","accountId":1,"name":"routing-3","contextMap":{},"wait":0,"bypassHostMaintenance":false}].
2026-01-30 07:52:30,798 DEBUG [c.c.u.n.HTTPUtils] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) Initializing new HttpMethodRetryHandler with retry count 5
2026-01-30 07:52:30,892 INFO [c.c.s.t.S3TemplateDownloader] (pool-10-thread-1:[ctx-2cad9dd2]) (logid:1b4a38df) Starting download from http://download.cloudstack.org/systemvm/4.6/systemvm64template-4.6.0-kvm.qcow2.bz2 to S3 bucket testbucket and size (304.60 MB) 319401369 bytes
2026-01-30 07:52:30,897 DEBUG [c.c.u.s.S.S3Utils] (pool-10-thread-1:[ctx-2cad9dd2]) (logid:1b4a38df) Sending stream as S3 object template/tmpl/1/3/routing-3/systemvm64template-4.6.0-kvm.qcow2.bz2 in bucket testbucket using PutObjectRequest
2026-01-30 07:52:31,217 DEBUG [c.c.u.s.S.S3Utils] (pool-10-thread-1:[ctx-2cad9dd2]) (logid:1b4a38df) Creating S3 client with configuration: [protocol: https, signer: null, connectionTimeOut: 10000, maxErrorRetry: -1, socketTimeout: 50000, useTCPKeepAlive: null, connectionTtl: null]
2026-01-30 07:52:31,468 DEBUG [c.c.u.s.S.S3Utils] (pool-10-thread-1:[ctx-2cad9dd2]) (logid:1b4a38df) Setting the end point for S3 client with access key AFYT2BKNI1U8T6DY6435 to https://10.0.33.100.
2026-01-30 07:52:33,803 INFO [o.a.c.s.i.BaseImageStoreDriverImpl] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) Updating store ref entry for template Template {"format":"QCOW2","id":3,"name":"SystemVM Template (KVM)","uniqueName":"routing-3","uuid":"56911227-fd0c-11f0-9d05-1e00e00002fb"}
2026-01-30 07:52:33,817 WARN [c.c.a.AlertManagerImpl] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) alertType=[28] dataCenterId=[1] podId=[null] clusterId=[null] message=[Failed to register template: 56911227-fd0c-11f0-9d05-1e00e00002fb with error: ].
2026-01-30 07:52:33,825 WARN [c.c.a.AlertManagerImpl] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) No recipients set in global setting 'alert.email.addresses', skipping sending alert with subject [Failed to register template: 56911227-fd0c-11f0-9d05-1e00e00002fb with error: ] and content [Failed to register template: 56911227-fd0c-11f0-9d05-1e00e00002fb with error: ].
2026-01-30 07:52:33,825 ERROR [o.a.c.s.i.BaseImageStoreDriverImpl] (pool-11-thread-1:[ctx-a125d9e1]) (logid:7259f280) Failed to register template: 56911227-fd0c-11f0-9d05-1e00e00002fb
with error:
- Used MinIO S3 storage; the SystemVM template got registered successfully.
However, the registered template was "systemvm64template-4.6.0-kvm.qcow2.bz2" and the SystemVMs were stuck in the Starting state.
- When the primary storage is zone-scoped, the logs show:
2026-01-30 09:32:23,838 DEBUG [o.a.c.s.v.VolumeServiceImpl] (Work-Job-Executor-7:[ctx-45ba2dab, job-40/job-47, ctx-3f578420]) (logid:17e47580) Found template Template {"format":"QCOW2","id":3,"name":"SystemVM Template (KVM)","uniqueName":"routing-3","uuid":"56911227-fd0c-11f0-9d05-1e00e00002fb"} in storage pool StoragePool {"id":1,"name":"pri","poolType":"NetworkFilesystem","uuid":"cfc7f591-fc1e-36dd-b2c5-dc6712acf57e"} with VMTemplateStoragePool: TmplPool[3-3-1-null]
2026-01-30 09:32:23,839 DEBUG [o.a.c.s.v.VolumeServiceImpl] (Work-Job-Executor-7:[ctx-45ba2dab, job-40/job-47, ctx-3f578420]) (logid:17e47580) Acquire lock on VMTemplateStoragePool 3 with timeout 3600 seconds
- When the primary storage is cluster-scoped, the logs show:
2026-01-30 09:47:14,659 DEBUG [o.a.c.s.c.m.StorageCacheManagerImpl] (Work-Job-Executor-6:[ctx-fdef51b0, job-62/job-64, ctx-9c2c6347]) (logid:11a5fb7e) waiting cache copy completion type: template, id: 3, lock: 638897157
2026-01-30 09:47:24,659 DEBUG [o.a.c.s.c.m.StorageCacheManagerImpl] (Work-Job-Executor-6:[ctx-fdef51b0, job-62/job-64, ctx-9c2c6347]) (logid:11a5fb7e) waken up
2026-01-30 09:47:24,661 DEBUG [o.a.c.s.c.m.StorageCacheManagerImpl] (Work-Job-Executor-6:[ctx-fdef51b0, job-62/job-64, ctx-9c2c6347]) (logid:11a5fb7e) waiting cache copy completion type: template, id: 3, lock: 638897157
Reproduced this issue with Ceph S3 used as secondary storage. I also get the empty error; the logs show: |
|
@kiranchavala Regarding the Ceph issue, your logs show an HTTPS endpoint ( However, testing with HTTP The
Could you try with HTTP instead? Is there a reason HTTPS was configured? |
|
Thanks @Damans227. The issue is resolved when I point it to an HTTP S3 link. Create a zone from scratch; in the zone creation wizard, add S3 as the secondary storage.
Will check again with a fresh deployment |
Got it. Thanks for checking. |
|
Also @Damans227, if possible, could you improve the CloudStack UI so that bucket details are shown in the secondary storage details? Currently there is no way to identify the bucket and S3 URL. |
Yea, I’ll look into it. Also noticed that when an exception happens during template registration, the error string is coming through empty. I’ll try to fix that as well. |
|
@blueorangutan package |
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 16690 |
|
@blueorangutan package |
|
@Damans227 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16691 |
|
@kiranchavala Bucket details shown in the secondary storage details: |



Description
This PR fixes an issue where the SystemVM template is not automatically downloaded to S3 secondary storage when adding it to a CloudStack zone.
Root Cause:
S3 stores use REGION scope, but DefaultEndPointSelector only returned LocalHostEndpoint for ZONE scope, so no endpoint was found to download the SystemVM template.
Fix:
Allow LocalHostEndpoint to handle SYSTEM template downloads for REGION-scoped stores, plus added null checks for S3 stores without URLs and enabled path-style access for S3-compatible storage.
Fixes: #9002
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
Broken:
Fixed:
Screencast.from.2026-01-14.13-52-40.mp4
How Has This Been Tested?
Test Environment:
Test Steps: